proposal: EvidenceAnchor — pluggable external anchoring for agt-evidence.json by giskard09 · Pull Request #2244 · microsoft/agent-governance-toolkit

giskard09 · 2026-05-13T11:29:01Z

Follows up on #2208 and the design direction outlined by @Ricky-G.

What this adds

A design proposal for backend-agnostic external anchoring of compliance evidence, structured exactly as requested in #2208:

Goals / non-goals and threat model — explicit that anchoring proves non-modification after anchor time, not correctness at write time
EvidenceAnchor plugin interface — Python ABC with anchor() and verify()
Canonical action_ref derivation — fully specified encoding rules so two independent implementations produce byte-identical hashes
agt-evidence.json schema changes — additive only, optional anchors array, no breaking changes
CLI semantics — agt verify --anchor runnable by auditor with no AGT runtime state
Three reference backends in priority order: filesystem/WORM, Sigstore Rekor, Mycelium Trails (plugin PR after core lands)
Compliance mapping — EU AI Act Art. 12, SOC 2 CC7.x, ISO 27001 A.12.4, FCA SYSC 9.1, Basel III BCBS 239
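
For orientation, here is a minimal sketch of what the plugin interface listed above might look like. Only the names EvidenceAnchor, anchor(), verify(), and AnchorReceipt (with its metadata field) appear in this thread. The other receipt fields, the argument lists, and the in-memory backend are assumptions made for illustration, and the bool return type reflects the first draft discussed in the review.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any
import hashlib


@dataclass(frozen=True)
class AnchorReceipt:
    # Field names other than `metadata` are assumptions for this sketch.
    backend: str
    action_ref: str
    metadata: dict[str, Any] = field(default_factory=dict)


class EvidenceAnchor(ABC):
    """Backend-agnostic anchoring interface (sketch, not the proposal text)."""

    @abstractmethod
    def anchor(self, action_ref: str, evidence: bytes) -> AnchorReceipt:
        """Record a digest of `evidence` in the external backend."""

    @abstractmethod
    def verify(self, receipt: AnchorReceipt, evidence: bytes) -> bool:
        """Check that `evidence` still matches what was anchored."""


class InMemoryAnchor(EvidenceAnchor):
    """Toy backend for illustration only; it offers no tamper resistance."""

    def __init__(self) -> None:
        self._log: dict[str, str] = {}

    def anchor(self, action_ref: str, evidence: bytes) -> AnchorReceipt:
        if action_ref in self._log:
            raise ValueError("append-only: action_ref already anchored")
        self._log[action_ref] = hashlib.sha256(evidence).hexdigest()
        return AnchorReceipt(backend="memory", action_ref=action_ref)

    def verify(self, receipt: AnchorReceipt, evidence: bytes) -> bool:
        expected = self._log.get(receipt.action_ref)
        return expected == hashlib.sha256(evidence).hexdigest()
```

The toy backend only demonstrates the call shape: anchoring stores a digest keyed by action_ref, and verification recomputes the digest and compares.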

Relationship to verifiable-compliance-receipts.md

Complementary. arian-gogani's proposal covers receipt signing. This proposal covers external anchoring — making the receipt survive the infrastructure that generated it. Both are needed.

Cross-system compatibility

action_ref canonicalization is compatible with azender1/SafeAgent RFC_EXECUTION_GUARD.md and the joint interface spec at giskard09/argentum-core#7 (DashClaw x SafeAgent x Mycelium).

github-actions · 2026-05-13T11:29:18Z

🤖 AI Agent: test-generator — `evidence_anchor.py`

`evidence_anchor.py`

test_anchor_failure_modes -- Validate behavior for enforce, queue, and best_effort modes when anchoring fails.
test_verify_status_transitions -- Test all possible transitions of AnchorVerifyStatus (e.g., VERIFIED, NOT_FOUND, HASH_MISMATCH, BACKEND_UNAVAILABLE).
test_append_only_conformance -- Ensure backends prevent modification or deletion of anchored records.
test_action_ref_canonicalization -- Confirm action_ref derivation produces byte-identical hashes across implementations.
test_plugin_registration_security -- Validate explicit plugin registration and detect potential security risks in auto-discovery.

`cli_verify.py`

test_cli_exit_codes -- Ensure CLI exit codes correctly map to AnchorVerifyStatus values.
test_verify_without_operator_network_access -- Confirm verification works with only evidence file and public anchor metadata, without AGT runtime state.
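
As a rough illustration of what test_cli_exit_codes would exercise, the sketch below maps the four AnchorVerifyStatus values named above onto three exit codes. The specific assignment (0 for verified, 1 for a verification failure, 2 for an unreachable backend) is an assumption; the thread only says that exit codes 0/1/2 exist.

```python
from enum import Enum


class AnchorVerifyStatus(Enum):
    VERIFIED = "verified"
    NOT_FOUND = "not_found"
    HASH_MISMATCH = "hash_mismatch"
    BACKEND_UNAVAILABLE = "backend_unavailable"


# Assumed mapping: 0 = verified, 1 = evidence failed verification,
# 2 = verification could not be completed.
EXIT_CODES = {
    AnchorVerifyStatus.VERIFIED: 0,
    AnchorVerifyStatus.NOT_FOUND: 1,
    AnchorVerifyStatus.HASH_MISMATCH: 1,
    AnchorVerifyStatus.BACKEND_UNAVAILABLE: 2,
}


def exit_code_for(status: AnchorVerifyStatus) -> int:
    """Translate a verification status into a process exit code."""
    return EXIT_CODES[status]
```

Keeping "could not check" distinct from "checked and failed" lets an auditor's script retry the former without masking the latter.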

`agt_evidence_schema.py`

test_anchor_status_field -- Validate all possible anchor_status values (anchored, pending, failed, skipped) are correctly handled.
test_empty_anchors_behavior -- Ensure verification behaves correctly when anchors field is absent or empty.
test_schema_changes_backward_compatibility -- Confirm new schema fields do not break existing functionality.

github-actions · 2026-05-13T11:29:19Z

🤖 AI Agent: breaking-change-detector — API Compatibility

API Compatibility

Severity	Change	Impact
Moderate	Introduction of `EvidenceAnchor` ABC with `anchor()` and `verify()` methods.	Potentially breaking for any existing code or plugins that interact with AGT's evidence handling and are not updated to implement the new interface.
Low	Changes to `agt-evidence.json` schema with new optional `anchors` array and `anchor_status` field.	Backward-compatible for existing evidence files, but downstream systems parsing `agt-evidence.json` must handle the new fields gracefully.
Low	CLI changes for `agt verify --anchor`.	May affect scripts or tools relying on the previous CLI behavior if they do not account for the new `--anchor` option.
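
To make "handle the new fields gracefully" concrete, a downstream parser could treat both additions as optional. This is a sketch under assumptions: the thread does not say whether anchor_status sits at the top level or on each record, so the top-level placement and the helper name read_anchor_info are hypothetical.

```python
import json
from typing import Optional

# Status values as listed by the test-generator comment.
KNOWN_STATUSES = {"anchored", "pending", "failed", "skipped"}


def read_anchor_info(raw: str) -> tuple[list, Optional[str]]:
    """Return (anchors, anchor_status) from an evidence document.

    Files written before anchoring existed carry neither field,
    so both are treated as optional here.
    """
    doc = json.loads(raw)
    anchors = doc.get("anchors") or []
    status = doc.get("anchor_status")
    if status is not None and status not in KNOWN_STATUSES:
        raise ValueError(f"unknown anchor_status: {status!r}")
    return anchors, status
```

An absent or empty anchors array yields an empty list rather than an error, which keeps pre-existing evidence files readable.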

github-actions · 2026-05-13T11:29:20Z

🤖 AI Agent: security-scanner — View details

No security issues found.

github-actions · 2026-05-13T11:29:21Z

🤖 AI Agent: contributor-guide — View details

Hi @giskard09! 👋 Thanks for this detailed and well-structured proposal—great job outlining the goals, design, and compliance mapping! 🚀

Before we can merge:

Please ensure the new MYCELIUM-EXTERNAL-ANCHOR-PROPOSAL.md file is linked in the main documentation index for discoverability.
Confirm that the action_ref derivation aligns with the existing utility in agt-core or clarify if a new utility will be created.

Check out our CONTRIBUTING.md for more details. Let us know if you need help! 😊

github-actions · 2026-05-13T11:29:24Z

🔴 Contributor Check: HIGH

Check	Result
Profile	HIGH
Credential	NONE
Overall	HIGH

Automated check by AGT Contributor Check.

github-actions · 2026-05-13T11:29:28Z

🤖 AI Agent: docs-sync-checker — Docs Sync

Docs Sync

EvidenceAnchor interface in MYCELIUM-EXTERNAL-ANCHOR-PROPOSAL.md -- missing docstrings for anchor() and verify() methods.
README.md -- no updates to reflect the new EvidenceAnchor feature or its implications.
CHANGELOG.md -- missing entry for the addition of the EvidenceAnchor feature and related changes to agt-evidence.json schema and CLI semantics.

Please address these documentation gaps.

github-actions · 2026-05-13T11:29:29Z

🤖 AI Agent: code-reviewer — Action Items:

TL;DR: 0 blockers, 2 warnings. Proposal is well-structured and aligns with security and compliance goals, but minor improvements are suggested.

#	Sev	Issue	Where
1	Warn	No explicit mention of rate-limiting or DoS mitigation for anchor backends.	EvidenceAnchor interface and backend implementation guidelines.
2	Warn	Lack of detailed examples for `metadata` field in `AnchorReceipt`.	EvidenceAnchor interface and schema changes for `agt-evidence.json`.

Action Items:

None.

Warnings (fine as follow-up PRs):

#	Description
1	Add guidance or requirements for rate-limiting and DoS protection in anchor backends.
2	Provide detailed examples of `metadata` field usage for different backends.

github-actions · 2026-05-13T11:29:39Z

PR Review Summary

Check	Status	Details
🔍 Code Review	⚠️ Warning	See details
🛡️ Security Scan	✅ Passed	No issues found
🔄 Breaking Changes	✅ Completed	Analysis complete
📝 Docs Sync	✅ Completed	Analysis complete
🧪 Test Coverage	✅ Completed	Analysis complete

Verdict: ⚠️ Ready for human review

giskard09 · 2026-05-13T11:58:08Z

@microsoft-github-policy-service agree

Ricky-G · 2026-05-14T08:47:48Z

@giskard09 Thanks for turning #2208 into a structured proposal — this is the right shape and the additive-schema choice is exactly right. Comments below, grouped by severity.

Blocking (technical)

action_ref canonicalization isn't yet deterministic. "Concatenated with || … with no separator" is contradictory; there are no length prefixes (so ab|cd and abcd| collide); scope is typed as a string but #2208 implied a structured object; and the Unicode normalization form is unspecified. Once action_ref ships, altering it is a breaking compliance change, so let's nail this down. Recommend either RFC 8785 JCS or deterministic CBOR with explicit field tags, rather than ad-hoc concatenation.
verify() -> bool is too thin to support the three CLI exit codes (0/1/2) you specify. Suggest a result type or typed exceptions distinguishing not-found / hash-mismatch / backend-unavailable. Also: where does the inclusion proof live for Merkle-log backends? Specify per-backend rather than leaving it in metadata: dict[str, Any].
"No network access to operator infrastructure" ≠ "fully offline." Rekor / chain / WORM verification all require network access to the anchor backend. One clarifying sentence avoids misunderstanding.

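The collision in the first blocking item can be reproduced in a few lines. The sketch below contrasts separator-free concatenation with a sorted-key JSON encoding. It only approximates RFC 8785 for string-only objects with ASCII keys and is not a conforming JCS implementation; the field names and the NFC normalization step are assumptions, not rules taken from the proposal.

```python
import hashlib
import json
import unicodedata


def naive_ref(fields: list[str]) -> str:
    """Ad-hoc concatenation with no separator or length prefix."""
    return hashlib.sha256("".join(fields).encode("utf-8")).hexdigest()


def canonical_ref(fields: dict[str, str]) -> str:
    """Hash a sorted-key, whitespace-free JSON encoding of string fields.

    NFC normalization is an assumed rule for this sketch; JCS itself
    does not normalize strings, and number handling is not covered.
    """
    normalized = {k: unicodedata.normalize("NFC", v) for k, v in fields.items()}
    encoded = json.dumps(
        normalized, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Different field boundaries, same bytes: the naive scheme collides.
assert naive_ref(["ab", "cd"]) == naive_ref(["abcd", ""])

# Field names here are hypothetical.
a = canonical_ref({"agent_id": "ab", "action": "cd"})
b = canonical_ref({"agent_id": "abcd", "action": ""})
assert a != b

# Key order of the input does not affect the result.
assert a == canonical_ref({"action": "cd", "agent_id": "ab"})
```

Because field names and quoting are part of the hashed bytes, a value can no longer slide across a field boundary without changing the digest.
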
Governance

Reference backend list shouldn't include a specific commercial/personal product. S3 Object Lock and Sigstore Rekor are infrastructure primitives; Mycelium Trails is your project. Suggest keeping WORM + Rekor as in-tree references and replacing "Priority 3 — Mycelium Trails" with a generic "on-chain anchor (see community plugins)" entry. A Mycelium plugin is welcome as a separate community repo once the core interface lands.
Normative dependencies on external/personal repos (azender1/SafeAgent, giskard09/argentum-core#7) — please inline the relevant canonicalization rules into this proposal, or mark those refs as informational/non-normative. AGT's on-disk format can't take a normative dependency on third-party repos.
Minor: **Author:** @giskard09 (Rama / Mycelium) — please drop the affiliation from the doc header (keep it in the PR description). Standard practice for vendor-neutral toolkits.

Spec gaps

Anchor failure semantics at write time — block the action? buffer and retry? fail-open? This is a critical behavioral contract for regulated deployments.
Batching — per-record anchoring is expensive on Rekor and prohibitive on-chain. At minimum acknowledge batched-Merkle-root anchoring as a v2 path so the v1 interface doesn't preclude it.
Append-only / monotonic-time conformance requirement for backends. Without this, an operator could rewrite-and-re-anchor. Should be a stated requirement on any conformant EvidenceAnchor, not implicit.
Plugin discovery — flag this as a security surface, not just an open question. Suggest explicit registration as the default; entry-point auto-discovery opt-in only.
Receipt-vs-raw-evidence: does the anchor cover the signed receipt from verifiable-compliance-receipts.md or the raw evidence? Different audit guarantees — needs one sentence.
Compliance table — please soften "satisfies" → "supports" throughout, especially for EU AI Act Art. 12 (Art. 12 requires automatic logging; tamper-evidence is supportive, not literally required). Auditors read this language strictly.
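
The plugin-discovery item above could be sketched as a registry where explicit registration is the default and entry-point scanning needs an opt-in. All names here, including the entry-point group "agt.evidence_anchors", are hypothetical, and the entry_points(group=...) call assumes Python 3.10 or later.

```python
from typing import Callable

_REGISTRY: dict[str, Callable[[], object]] = {}


def register_backend(name: str, factory: Callable[[], object]) -> None:
    """Explicitly register a backend factory; duplicates are rejected."""
    if name in _REGISTRY:
        raise ValueError(f"backend already registered: {name}")
    _REGISTRY[name] = factory


def get_backend(name: str, *, allow_entry_points: bool = False) -> object:
    """Resolve a backend by name.

    Entry-point auto-discovery stays off unless the operator opts in,
    so installing a package cannot silently add an anchor backend.
    """
    if name in _REGISTRY:
        return _REGISTRY[name]()
    if allow_entry_points:
        from importlib.metadata import entry_points

        # Hypothetical group name, used only for this sketch.
        for ep in entry_points(group="agt.evidence_anchors"):
            if ep.name == name:
                return ep.load()()
    raise LookupError(f"no registered backend named {name!r}")
```

Rejecting duplicate names matters as much as the opt-in flag: without it, a later registration could shadow a trusted backend.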

Would strengthen the proposal

An Impacts section: latency added to the action path, per-anchor cost (Rekor ~free; on-chain $0.001–$0.10/tx), evidence-file growth, availability coupling to anchor backend, security surface from plugin loading.
Two Mermaid diagrams — write path and audit-verify path. The whole point of this proposal is the trust boundary, and a diagram makes it visible at a glance. Happy to suggest specific ones if useful.

Thank you, this is a very promising direction and addresses a real gap. It just needs the canonicalization tightened, the vendor-neutrality cleaned up, and the operational impacts spelled out before it is mergeable as a design baseline.

…e.json

Adds design proposal for a backend-agnostic EvidenceAnchor interface, canonical action_ref derivation spec, agt-evidence.json schema extension, and CLI changes for agt verify --anchor. References issue microsoft#2208.

Signed-off-by: giskard09 <playplay2736@gmail.com>

giskard09 · 2026-05-14T10:01:20Z

Thank you for the detailed review — this is exactly the kind of signal that makes a proposal shippable. Working on v2 now.

Quick alignment check on two design decisions before I push:

For action_ref canonicalization: RFC 8785 JCS over the current ad-hoc approach — agree this is the right call. Any preference on whether timestamp_ms stays as int64 big-endian (current APS wire format) or gets serialized as a number in the JCS object?
Failure semantics: proposing fail-open (log-and-continue) as the default with fail-closed as a config option for regulated deployments. Does that match AGT's behavioral contract expectations, or do you prefer fail-closed as the default with explicit opt-out?

Everything else in your review is clear — will address all blocking, governance, and spec gap items in v2.

Ricky-G · 2026-05-14T10:29:32Z

Both good questions — quick takes:

1. timestamp_ms under JCS

Drop the int64-big-endian rule entirely; it can't coexist cleanly with JCS without re-introducing the drift we're trying to eliminate. Two options that both work:

Preferred: RFC 3339 string, UTC only, mandatory Z, fixed 3-digit millisecond precision ("2026-05-13T10:00:00.123Z"). Future-proof against ns/μs precision later, matches Sigstore/Rekor conventions (which helps your Priority 2 backend), human-readable in the evidence file.
Acceptable: keep it as a JSON number named timestamp_ms, but explicitly forbid values exceeding 2^53−1 and forbid sub-millisecond precision in v1. JCS will serialize it deterministically inside that range.

Either is fine; just pick one and remove the binary wire format.
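
A minimal sketch of the preferred option, using only the standard library. The function names are hypothetical, and truncating (rather than rounding) sub-millisecond digits is an assumption the proposal would need to state explicitly.

```python
from datetime import datetime, timezone
import re

# Shape check only: UTC, mandatory Z, exactly three fractional digits.
_TS_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z")


def format_timestamp(dt: datetime) -> str:
    """Render an aware datetime as RFC 3339 UTC with exactly 3 ms digits."""
    if dt.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; pass an aware one")
    utc = dt.astimezone(timezone.utc)
    # isoformat(timespec="milliseconds") truncates microseconds to ms.
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def is_canonical_timestamp(value: str) -> bool:
    """Reject offsets, missing fractions, and other precisions."""
    return _TS_RE.fullmatch(value) is not None
```

For example, a datetime of 2026-05-13 10:00:00.123456 UTC renders as "2026-05-13T10:00:00.123Z", matching the format quoted above.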

2. Failure semantics

I'd push back on fail-open as the default. AGT's positioning (governance, zero-trust, OWASP Agentic Top 10) and the regulatory framing in your own proposal (EU AI Act Art. 12, FCA SYSC 9.1) both argue against silently-incomplete audit trails. Comparable tools (OPA, Sigstore policy-controller, Kyverno) all default to enforce/deny on the enforcement path. A fail-open default in a Microsoft-owned governance toolkit is the kind of thing that gets flagged in security review.

Suggest a 3-mode config instead of a binary, since the real-world pattern doesn't fit neatly into either:

enforce (default) — action blocks if the anchor write fails. Fail-closed.
queue — action proceeds; the anchor request is written to a durable local queue and retried with backoff; the evidence record is tagged anchor_status: "pending" and the auditor's verify flags it. This is what most high-throughput regulated systems actually want, since it preserves availability without dropping the audit guarantee.
best_effort — log-and-continue. Explicit opt-in for dev / non-regulated use. Records marked anchor_status: "skipped".

Hard rule regardless of mode: any record not successfully anchored MUST be marked in the evidence file (anchor_status: "pending" | "failed" | "skipped"). Silent fail-open is the actually-bad outcome — making the gap visible to the auditor is the non-negotiable part.
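
The three modes and the hard rule might be wired together roughly as follows. This is a sketch under assumptions: the function and exception names are hypothetical, an in-memory list stands in for the durable queue, and marking the record "failed" in enforce mode is one possible reading of the status values listed above.

```python
from enum import Enum
from typing import Callable


class AnchorMode(Enum):
    ENFORCE = "enforce"
    QUEUE = "queue"
    BEST_EFFORT = "best_effort"


class AnchorWriteError(Exception):
    """Raised in enforce mode so the caller blocks the action."""


def anchor_with_mode(
    record: dict,
    do_anchor: Callable[[dict], None],
    mode: AnchorMode,
    retry_queue: list,
) -> dict:
    """Try to anchor `record` and always stamp the outcome on it."""
    try:
        do_anchor(record)
    except Exception as exc:
        if mode is AnchorMode.ENFORCE:
            record["anchor_status"] = "failed"
            raise AnchorWriteError(str(exc)) from exc
        if mode is AnchorMode.QUEUE:
            retry_queue.append(record)  # stand-in for a durable queue
            record["anchor_status"] = "pending"
        else:
            record["anchor_status"] = "skipped"
        return record
    record["anchor_status"] = "anchored"
    return record
```

Every path sets anchor_status before returning or raising, so no code path can leave an unanchored record unmarked.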

Also, another point:

Observability — please make this explicit in v2

Don't leave "anchor failure emits a log/metric" implicit. Spec it:

Anchor failures (any mode) MUST emit a structured event on AGT's existing telemetry channels — not only as an anchor_status field on the evidence record. The evidence file alone is not a sufficient failure-detection surface (it's the artifact the control was supposed to protect; relying on it to also report its own failure is circular, and an attacker who can suppress evidence writes hides the failure).
Suggest concrete names so operators can alert on them consistently:
Log event: agt.evidence.anchor.failed with fields {backend, action_ref, error_class, attempt}
Metrics: agt_evidence_anchor_failures_total{backend,reason} counter, agt_evidence_anchor_pending gauge (for queue mode backlog), agt_evidence_anchor_latency_seconds histogram
OpenTelemetry span on the anchor call, with the failure recorded as a span event so it shows up in existing traces
Whatever the existing AGT logging conventions are, anchor events should follow them rather than inventing a new format.
This is what bridges "the proposal" to "what an SRE deploys and a SOC can alert on" — without it, every operator reinvents it differently and the audit story fragments.
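
A sketch of the failure event using only the standard library. The event and field names are the ones suggested above; the Counter is a stand-in for a real metrics client, and using the exception class as the reason label is an assumption.

```python
import json
import logging
from collections import Counter

logger = logging.getLogger("agt.evidence.anchor")

# Stand-in for agt_evidence_anchor_failures_total{backend,reason};
# a real deployment would use its metrics client instead.
anchor_failures_total: Counter = Counter()


def report_anchor_failure(
    backend: str, action_ref: str, error: Exception, attempt: int
) -> dict:
    """Emit the structured failure event and bump the failure counter."""
    event = {
        "event": "agt.evidence.anchor.failed",
        "backend": backend,
        "action_ref": action_ref,
        "error_class": type(error).__name__,
        "attempt": attempt,
    }
    logger.error(json.dumps(event, sort_keys=True))
    anchor_failures_total[(backend, event["error_class"])] += 1
    return event
```

The event goes out on the logging channel independently of the evidence file, which is the separation the circularity argument above calls for.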

Looking forward to v2.

Blocking technical:
- action_ref canonicalization: replace ad-hoc concatenation with RFC 8785 JCS
- timestamp: RFC 3339 UTC string (3-digit ms), drop int64 big-endian
- verify() returns AnchorVerifyResult with typed status enum (verified / not_found / hash_mismatch / backend_unavailable) + optional InclusionProof
- Add clarification: verification requires network access to anchor backend, not to operator infrastructure

Governance:
- Remove Priority 3 Mycelium Trails from reference backend list; replace with generic "on-chain anchor (community plugins)" entry
- Mark azender1/SafeAgent and argentum-core#7 refs as [informational, non-normative]; inline normative canonicalization rules in proposal body
- Drop "Rama / Mycelium" affiliation from doc header

Spec gaps:
- Failure semantics: 3-mode config (enforce default / queue / best_effort) with hard rule: any unanchored record MUST carry anchor_status
- Batching: acknowledge v2 Merkle-root path; v1 interface accommodates it
- Append-only conformance requirement made explicit
- Plugin discovery: explicit registration default; auto-discovery opt-in with documented security surface
- Receipt vs raw evidence: one-sentence clarification
- Compliance table: "satisfies" → "supports" throughout; EU AI Act note added

Additions:
- Observability section: log event agt.evidence.anchor.failed, 3 metrics, OTel span, following AGT telemetry conventions
- Impacts section: latency, cost, file growth, availability coupling, security surface
- Two Mermaid diagrams: write path and audit-verify path

Signed-off-by: giskard09 <playplay2736@gmail.com>

giskard09 · 2026-05-14T12:12:17Z

v2 pushed — addresses all three points:

timestamp_ms → RFC 3339 string (UTC, Z, 3-digit ms), int64 big-endian removed throughout
Failure semantics → 3-mode config (enforce default / queue / best_effort) with hard rule: any unanchored record MUST be marked anchor_status
Observability section added: agt.evidence.anchor.failed log event, three metrics (failures_total, pending gauge, latency histogram), OTel span on anchor call

All other blocking/governance/spec-gap items from your first review also incorporated. Ready for another pass when you have time.

… miyannishar

- Mycelium Trails moved out of reference backends to community plugin path; proposal now ships WORM + Sigstore Rekor as in-tree references only
- azender1/SafeAgent and argentum-core#7 moved to explicit "Related work" section, marked informational
- Forward-looking anchor_batch() note added at end of EvidenceAnchor interface section to signal v1 extensibility without breaking changes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

giskard09 · 2026-05-14T14:52:12Z

Thanks for the thorough review — v3 pushed.

Changes from v2:

Mycelium Trails moved out of the reference backends list and into the community plugins path. The proposal now ships with WORM + Sigstore Rekor only as in-tree references.
azender1/SafeAgent and argentum-core#7 references moved to an explicit "Related work" section at the end of the document, marked informational.
Added a forward-looking note on anchor_batch() at the end of the EvidenceAnchor interface section to keep v1 extensible without breaking changes.
"satisfies" → "supports" throughout the compliance table (EU AI Act Art. 12, FCA SYSC 9.1).

Let me know if anything else needs adjustment before this is ready to land.
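
For readers wondering what the forward-looking anchor_batch() path could anchor, here is a sketch of a batched Merkle root over a list of action_ref values. The leaf and node prefixes follow the domain-separation idea used by transparency logs, but the helper names and the simple pairing scheme are assumptions of this sketch, not something the proposal specifies.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(action_refs: list[str]) -> str:
    """Fold a batch of action_ref strings into one hex digest.

    Leaves and interior nodes use different prefixes so a leaf
    can never be reinterpreted as an interior node.
    """
    if not action_refs:
        raise ValueError("cannot anchor an empty batch")
    level = [_h(b"\x00" + ref.encode("utf-8")) for ref in action_refs]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(_h(b"\x01" + level[i] + level[i + 1]))
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # odd node is promoted unchanged
        level = nxt
    return level[0].hex()
```

A backend would then anchor one root per batch instead of one entry per record, which is the cost concern raised in the first review.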

giskard09 mentioned this pull request May 13, 2026

agt-evidence.json — external anchor for independently verifiable compliance evidence #2208

Closed

github-actions Bot added the documentation Improvements or additions to documentation label May 13, 2026

github-actions Bot added the size/M Medium PR (< 200 lines) label May 13, 2026

github-actions Bot added the needs-review:HIGH Contributor reputation check flagged HIGH risk label May 13, 2026

imran-siddique enabled auto-merge (squash) May 14, 2026 06:20

Ricky-G force-pushed the proposal/mycelium-external-anchor branch from 1e64627 to 1b11d83 Compare May 14, 2026 09:09

auto-merge was automatically disabled May 14, 2026 09:58
Head branch was pushed to by a user without write access

giskard09 force-pushed the proposal/mycelium-external-anchor branch from 1b11d83 to 9f475bd Compare May 14, 2026 09:58

github-actions Bot added size/L Large PR (< 500 lines) and removed size/M Medium PR (< 200 lines) labels May 14, 2026

miyannishar mentioned this pull request May 14, 2026

feat: Docs/governance event sink spi proposal #2240

Open

Merge branch 'main' into proposal/mycelium-external-anchor

de376cb

giskard09 mentioned this pull request May 15, 2026

Post-settlement accountability layer: tamper-evident proof of agent action after payment x402-foundation/x402#2332

Open
